APPARATUS, SYSTEM, AND METHOD FOR VIDEO ENCODER RATE CONTROL

FIELD OF THE INVENTION 
[0001] The present invention generally relates to rate control of video compression 
encoders. More particularly, the present invention relates to constant bit rate (CBR) and 
variable bit rate (VBR) control for block-based video encoding, including but not limited to 
MPEG compatible video encoding. 

BACKGROUND OF THE INVENTION 
[0002] Video compression is commonly used to reduce the data storage and/or
transmission requirements of a recorded video stream. For example, the Moving Picture
Experts Group (MPEG) standards define several commonly used video compression
schemes.

[0003] MPEG-1 is intended for progressive video and is commonly used to store 
video on compact discs, such as Video Compact Disc (VCD). The MPEG-1 standard defines 
a group of pictures (GOP). Referring to prior art Figure 1, each GOP commences with an 
intra-coded picture frame, I. Motion compensated predictive feedback is used to compress 
subsequent inter-coded frames, P. Bidirectionally predicted frames, B, are coded using 
motion compensated prediction based on both previous and successive I or P frames. MPEG-2
adds compression support for interlaced video content.

[0004] MPEG video compression divides each individual frame into regions called 
macroblocks. Individual macroblocks may be predicted from neighboring frames. A discrete 
cosine transform (DCT) is applied to the frame to compress the frame. The resulting DCT 
coefficients for each macroblock are then quantized. A variable length encoder is used to 
encode the data. 

[0005] A rate controller is used to select the quantization step size, which for a given
image complexity will determine the bit rate. In addition, the quality of the image also
depends upon the quantization step size. Conventionally, the bit rate, R, is modeled
according to the expression:

$$R = \frac{X}{Q},$$

where X is the total image complexity and Q is a quantization step size. Thus, the bit rate,
quantization step size, image complexity, and image quality are inter-related.

[0006] The bit rate/quality tradeoffs that conventional MPEG encoders make are not
as sophisticated as desired. For many applications, conventional MPEG encoders do not
provide a fine enough level of control, particularly for single-pass MPEG encoders used in
real time systems.

[0007] Therefore, what is desired is an improved apparatus, system, and method for 
rate control in an MPEG encoder. 

SUMMARY OF THE INVENTION 
[0008] A programmable rate controller for a video compression encoder is disclosed. 
In one embodiment, the programmable rate controller includes a variable bit rate encoder 
generating a first quantization step size, a constant bit rate encoder generating a second 
quantization step size, and a selector for selecting a maximum permissible quantization step 
size. 

[0009] The variable bit rate controller has a target peak bit rate and a target average bit
rate. In one embodiment, the variable bit rate controller adjusts the quantization step so that 
the average bit rate of the output bitstream of the encoder tracks the target average bit rate. In 
some embodiments, a proportional integral control technique is used to track the target 
average bit rate according to a selectable time constant. 

[0010] In one embodiment, the constant bit rate controller determines a statistical 
frequency of macroblock types within a current picture, generates a statistical indicator 
indicative of a complexity of each type of macroblock, predicts picture complexity using the 
statistical frequency of macroblock types and the statistical indicator of macroblock type 
complexity, generates a bit allocation consistent with the predicted picture complexity, and 
assigns a quantizer step size consistent with the bit allocation. 

BRIEF DESCRIPTION OF THE FIGURES 
[0011] The invention is more fully appreciated in connection with the following
detailed description taken in conjunction with the accompanying drawings, in which:

[0012] Figure 1 is a prior art drawing illustrating an MPEG group of pictures (GOP).

[0013] Figure 2 is a block diagram of a video compression encoder in accordance
with one embodiment of the present invention.

[0014] Figure 3 is a block diagram of a CBR rate controller in accordance with one
embodiment of the present invention. 

[0015] Figure 4 is a block diagram of a CBR bit allocator in accordance with one 
embodiment of the present invention. 

[0016] Figure 5 is a block diagram of a dual CBR/VBR rate controller in accordance 
with one embodiment of the present invention. 

[0017] Figure 6 is a block diagram of a core VBR rate controller in accordance with 
one embodiment of the present invention. 

[0018] Figure 7 is a block diagram of an equivalent model of a VBR rate controller 
for one set of conditions. 

[0019] Figure 8 is a plot illustrating exemplary quantizer step-size behavior for the 
dual VBR/CBR rate controller. 

Like reference numerals refer to corresponding parts throughout the several views of 
the drawings. 

DETAILED DESCRIPTION OF THE INVENTION 
[0020] Figure 2 is a block diagram of a video compression encoder 200 in accordance 
with one embodiment of the present invention. It will be understood that the video 
compression encoder is adapted to receive video images and encode the video images to 
generate an output bit stream in compliance with a block-based video compression standard 
such as MPEG-1, MPEG-2, MPEG-4, or H.264, etc. 

[0021] Encoder 200 includes an MPEG motion estimation module 210, a macroblock 
coding decision module 220, a transform module 230 to perform a discrete cosine transform 
(DCT), a quantization module 240 to quantize the compressed DCT coefficients on a 
macroblock-per-macroblock basis according to a quantization step size, a variable length 
encode module 250 for encoding compressed image data into an output bitstream, and a 
programmable rate controller 260 for selecting the quantization step size. In one 
embodiment, quantization module 240 includes a virtual quantizer scale that takes on values 
from 2 to any arbitrarily high value, e.g., 512. An encoder video bitstream verification 
(VBV) buffer (not shown) may be included in the encoder. A VBV buffer is a model 
hypothetical decoder buffer used to determine potential decoder buffer underflow and 
overflow conditions. It is desirable that the bitstream remain VBV compliant such that a 
corresponding decoder does not suffer a deleterious underflow or overflow condition. 

[0022] Programmable rate controller 260 is programmed to balance the short-term
and long-term output bit production of the encoder against the video quality of the resulting 
decoded pictures. For MPEG-1 and MPEG-2, this balance may be accomplished by setting 
the quantization step size of the DCT coefficients on a macroblock-by-macroblock basis to 
attempt to avoid deleterious decoder buffer states that degrade image quality (e.g., underflow 
or overflow for constant bit rate encoding). 

[0023] In one embodiment, programmable rate controller 260 includes a variable bit 
rate (VBR) rate controller 280, a constant bit rate (CBR) rate controller 290, and a selector 
(not shown) for selecting an output from either the VBR rate controller 280 or the CBR rate 
controller 290. In one embodiment, the selector 510 picks the rate controller having the 
largest quantization step size as the output. VBR rate controller 280 permits a variable bit 
rate mode of operation. CBR rate controller 290 permits a constant bit rate mode of 
operation. For CBR rate controller 290 the target average and target peak bit rates are the 
same. For VBR rate controller 280 the target average and target peak bit rate may be set 
independently. 

[0024] As described below in more detail, in one embodiment CBR rate controller 
290 classifies macroblock types, generates energy estimates of macroblock types, and creates 
a complexity estimate from macroblock statistics, from which a target bit rate is estimated. 
Additionally, as described below in more detail, in one embodiment the VBR rate controller 
280 creates a target bit allocation by measuring changes in the average bit rate of the output 
bitstream over time, e.g., by tracking instantaneous and cumulative deviations between the 
actual and target long-term average bit rates and re-adjusting the target bit allocation 
accordingly. 

[0025] The mode of operation (CBR or VBR) of programmable rate controller 260 
will depend upon parameter constraints input to programmable rate controller 260 and a 
logical condition selected for the arbitration logic to choose either CBR or VBR. This 
permits, for example, the mode of operation to be selected to be entirely CBR, entirely VBR, 
or to switch back and forth between CBR and VBR depending upon the complexity of the 
picture frames that are being encoded and other parameters that are selected. As a result, 
programmable rate controller 260 has a response that may be adapted for different encoding 
applications by selecting the value of parameter constraints. 

[0026] Programmable rate controller 260 includes a parameter select input 215 for
defining parameters to adjust the function of programmable rate controller 260. As described
below in more detail, certain constraints such as the size of the video bitstream verification
(VBV) buffer and the peak rate may be selected to guarantee MPEG-2 compliance and/or
playback on a specified device, such as a VCD or DVD player. Other constraints, such as
the target long-term average bit rate, may be imposed so that applications can predict and/or
pre-allocate the size of the output bitstream prior to encoding. Examples of programmable rate
control parameters include a target average bit rate, $R_{avg}$; a maximum bit rate, $R_{peak}$,
corresponding to a maximum bit rate specified in the header of the bitstream used by the video
bitstream verification model; a bit rate time constant, $\tau$, for adjusting VBR operation to
deviations in average bit rate; a VBV buffer size, $B_{vbv}$, in bits; a target quantizer scale, $Q_{target}$,
for all macroblocks used by the VBR rate controller; an initial quantizer scale, $Q_0$, for the
VBR rate controller; a minimum quantizer scale value, $Q_{min}$, a lower bound on the target VBR
quantizer scale value; and a maximum quantizer scale value, $Q_{max}$, an upper bound on the target
VBR quantizer scale value for a picture. Additionally, other parameters, such as a dither
update period and a picture weighting factor, may be selected. In one embodiment, if a
constant rate flag is set, a VBV-delay field of the picture will be encoded with a non-0xFFFF
value for MPEG-2 bitstreams, resulting in true MPEG CBR streams with zero stuffing. It
will also be understood that enable/disable signals may be included to enable or disable the
CBR rate controller or the VBR rate controller. Some of these parameters are further
described in Appendix 1, along with some of the associated limitations in independently
setting these parameters caused by the inter-relationship of bit rate, quantizer size, image
quality, and image complexity.

[0027] Referring to Figure 3, in one embodiment CBR rate controller 290 includes a 
picture analysis module 310, a complexity model module 320, a bit allocation module 330, 
and a picture-level quantizer assignment module 340. CBR rate controller 290 strives for 
consistent video quality over a rolling window of N future pictures, where N is a multiple of
the GOP size. The requirement for constant bit-rate is achieved implicitly by preventing 
overflow and underflow in the output bit buffer (the VBV buffer for MPEG compliant 
encoding). Because the VBV bit buffer is filled at a constant rate once per picture, the 
bitstream is guaranteed to be CBR compliant if the VBV buffer does not overflow or 
underflow. The CBR rate controller 290 predicts the relationship between rate and quantizer 
step size based on statistics in the current picture and on the observed relationship between 
rate and quantizer step-size in previously encoded pictures. Based on these relationships, bits 
are allocated for the current picture with the goal of maintaining constant quality over the 
next N pictures. 

[0028] It will be understood that the programmable rate controller implements
separate rate-quantization models for quantization-dependent and quantization-independent 
bits. Quantization-dependent bits are encoded bits that vary directly with the quantization 
step size. For intra blocks, quantization-dependent bits are those bits resulting from the 
encoding of the AC DCT coefficients. For non-intra blocks, quantization-dependent bits are 
those bits resulting from the encoding of all DCT coefficients. In both cases, quantization- 
dependent bits exclude bits resulting from the encoding of motion vectors, headers, and 
skipped macroblocks. Quantization-independent bits are all non-quantization-dependent bits 
in a picture. The CBR rate controller 290 creates running estimates of the number of
quantization-independent bits in a picture independently for each picture type. The estimates
are the output of a first-order infinite impulse response (IIR) filter operating on
the past totals of quantization-dependent bits from pictures of the same type.
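
The running estimate described in this paragraph can be sketched as a first-order IIR filter that is kept separately for each picture type. This is a minimal illustration only: the function name, the smoothing coefficient `alpha` and its value, and the choice to seed the filter with the first observation are assumptions, not details given in the text.

```python
def update_bit_estimate(estimates, picture_type, observed_bits, alpha=0.75):
    """First-order IIR update of a per-picture-type running bit estimate.

    estimates: dict mapping a picture type ('I', 'P', 'B') to its current estimate.
    alpha: assumed smoothing coefficient in [0, 1); the text does not give a value.
    """
    previous = estimates.get(picture_type)
    if previous is None:
        # Assumption: seed the filter with the first observation of this type.
        estimates[picture_type] = float(observed_bits)
    else:
        estimates[picture_type] = alpha * previous + (1.0 - alpha) * observed_bits
    return estimates[picture_type]


# Each picture only updates the estimate for its own picture type.
estimates = {}
for ptype, bits in [("I", 4000), ("P", 1000), ("P", 2000), ("I", 4000)]:
    update_bit_estimate(estimates, ptype, bits)
```

Keeping one filter state per picture type prevents the large totals typical of I-pictures from skewing the estimates for P- and B-pictures.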

[0029] Picture analysis module 310 classifies macroblocks by macroblock type and 
computes a statistical measure, called an energy value, indicative of the number of bits 
required to encode macroblocks of each type. Picture analysis module receives as inputs 
input image data 1, motion-compensated difference image data 2, and macroblock coding 
decision data 3 for picture i. A table listing some of the variables used in the rate 
quantization model is included in Appendix 2. A summary of some of the signals in the rate 
controller is included in Appendix 3. 

[0030] In one embodiment, input image data 1 is in the form of luminance values of
each pixel, which can be expressed as the set $\{P_{x,y}(i,j);\ j \in J\}$, where $P_{x,y}(i,j)$ is the
luminance value of the pixel at row x and column y of macroblock j in the original input
picture i, and J is the set of macroblock indices in a picture. In one
embodiment, motion compensated difference image data 2 can be expressed as the set
$\{R_{x,y}(i,j);\ j \in J\}$, where $R_{x,y}(i,j)$ is the luminance value of the pixel at row x and
column y of macroblock j in the difference image resulting from the motion
compensation of picture i, for the set, J, of macroblock indices in a picture.

[0031] For each input picture, picture analysis module 310 classifies the macroblocks 
by macroblock types having distinct rate-quantization properties. The macroblock types are 
classified to generate statistics regarding the frequency of macroblocks that have different 
rate-quantization properties. The set of possible macroblock types is specified by the set K of 
different macroblocks (where K has at least two members) and is based on the macroblock 
coding decisions, assumed here to have been made prior to the start of rate control by 
macroblock coding decisions module 220. Examples of macroblock types in set K may
include: intra blocks in an I-picture; intra blocks in a P- or B-picture; non-intra blocks in a
P-picture without bi-directional motion compensation; non-intra blocks in a B-picture without
bi-directional motion; and non-intra blocks in a B-picture with bi-directional motion.

[0032] The number of macroblocks of each type within the set, K, is counted. In one
embodiment, macroblock counts, $\{\Phi_k(i);\ k \in K\}$, are computed according to the equation:

$$\Phi_k(i) = \sum_{j \in J} \delta_k(i,j),$$

where

$$\delta_k(i,j) = \begin{cases} 1 & \text{if macroblock } j \text{ in picture } i \text{ is of type } k \\ 0 & \text{otherwise} \end{cases}$$

and J is a set of indices referring to each of the macroblocks in the current picture. Next, the
counts are normalized by the total number of macroblocks in the picture, resulting in a set of
associated occurrence frequencies, $\{\Gamma_k(i);\ k \in K\}$, given by the equation:

$$\Gamma_k(i) = \frac{\Phi_k(i)}{\sum_{k' \in K} \Phi_{k'}(i)}.$$

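[0033] Frequency measurements for the current picture are combined with past
estimates to generate running frequency estimates, $\{\bar{\Gamma}_{m,k}(i);\ k \in K, m \in M\}$, which may be
calculated according to the equation:

$$\bar{\Gamma}_{m,k}(i) = \begin{cases} \text{combination of } \Gamma_k(i) \text{ and } \bar{\Gamma}_{m,k}(i-1) & \text{if } m = n \\ \bar{\Gamma}_{m,k}(i-1) & \text{otherwise,} \end{cases}$$

where n is the picture type for the current picture. $\bar{\Gamma}_{m,k}(i)$ may be used to estimate the
probability with which a macroblock of type k will occur in a picture of type m.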
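
The counting and normalization steps of paragraphs [0032] and [0033] can be sketched as follows. The macroblock type labels and the function name are hypothetical, and the sketch covers only the per-picture counts and frequencies, not the running estimate.

```python
from collections import Counter

# Hypothetical labels for the macroblock types in set K.
MB_TYPES = ("intra_I", "intra_PB", "fwd_P", "uni_B", "bi_B")


def macroblock_statistics(mb_types, type_set=MB_TYPES):
    """Return (counts, frequencies) for the macroblocks of one picture.

    mb_types: one type label per macroblock in the picture.
    counts[k] is the number of macroblocks of type k; frequencies[k] is that
    count divided by the total number of macroblocks in the picture.
    """
    tally = Counter(mb_types)
    total = len(mb_types)
    counts = {k: tally.get(k, 0) for k in type_set}
    frequencies = {k: (counts[k] / total if total else 0.0) for k in type_set}
    return counts, frequencies


counts, frequencies = macroblock_statistics(["intra_PB", "fwd_P", "fwd_P", "fwd_P"])
```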

[0034] Picture analysis module 310 also generates statistical information indicative of 
the number of bits required to encode a macroblock of a particular type with a given 
quantizer step size. In one embodiment, CBR rate controller 290 uses a difference 
measurement, such as a mean absolute difference (MAD) measure, of each macroblock type 
as an activity measurement to calculate an energy value indicative of the number of bits 
required to encode a macroblock type with a given quantizer step size. In a MAD 
embodiment, a macroblock type with a comparatively large MAD value is presumed to 
require more bits to encode than a macroblock type with a smaller MAD value. 

[0035] For an intra macroblock with index j in picture i, the MAD value may be
computed as the mean absolute difference between the original luminance pixels, $P_{x,y}(i,j)$,
and the mean pixel luminance of the macroblock, $\bar{P}(i,j)$, according to the equation:

$$MAD_{intra}(i,j) = \frac{1}{256} \sum_{x=1}^{16} \sum_{y=1}^{16} \left| P_{x,y}(i,j) - \bar{P}(i,j) \right|.$$



[0036] For a non-intra macroblock with index j in picture i, the MAD value is
calculated as the mean absolute value of the luminance motion compensated difference
values, $R_{x,y}(i,j)$, according to the equation:

$$MAD_{non\text{-}intra}(i,j) = \frac{1}{256} \sum_{x=1}^{16} \sum_{y=1}^{16} \left| R_{x,y}(i,j) \right|.$$
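
The two MAD measures can be illustrated with a short sketch. It assumes each macroblock is supplied as a 16x16 list of rows; the function names are illustrative and are not taken from the text.

```python
def mad_intra(block):
    """Mean absolute difference between luminance pixels and the block mean.

    block: 16x16 list of rows of luminance values for one macroblock.
    """
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels) / len(pixels)


def mad_non_intra(residual):
    """Mean absolute value of the motion compensated luminance differences.

    residual: 16x16 list of rows of difference values for one macroblock.
    """
    values = [r for row in residual for r in row]
    return sum(abs(r) for r in values) / len(values)


flat = [[128] * 16 for _ in range(16)]
checker = [[2 * ((x + y) % 2) for y in range(16)] for x in range(16)]
residual = [[3 if (x + y) % 2 else -3 for y in range(16)] for x in range(16)]
```

A flat block has zero intra MAD, while the checkerboard and the alternating residual both have non-zero MAD, which is consistent with the presumption that a larger MAD value requires more bits to encode.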

[0037] Picture analysis module 310 uses the MAD values to calculate an energy value
for each macroblock type, with the energy value scaling the MAD value by an empirical
factor to provide an approximate indication of image complexity. In one embodiment, an
energy measure, $\{e_k(i);\ k \in K\}$, for each macroblock type is calculated by averaging the MAD
values (raised to the power $\beta$) over all macroblocks in each macroblock type according to the
equation:

$$e_k(i) = \frac{1}{\Phi_k(i)} \sum_{j \in J} \delta_k(i,j) \left[ MAD(i,j) \right]^{\beta}.$$

An exemplary value of $\beta$ as determined from empirical investigations is $\beta = 1.45$.
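
As an illustration of the energy measure, the following sketch averages the MAD values raised to the power beta within each macroblock type. The function name and the type labels are hypothetical; the exponent 1.45 is the exemplary value stated in the text.

```python
BETA = 1.45  # exemplary exponent stated in the text


def energy_per_type(mads, mb_types, beta=BETA):
    """Average of MAD**beta over the macroblocks of each type.

    mads: MAD value of each macroblock in the picture.
    mb_types: type label of each macroblock, in the same order as mads.
    Types that do not occur in the picture are omitted from the result.
    """
    sums, counts = {}, {}
    for mad, k in zip(mads, mb_types):
        sums[k] = sums.get(k, 0.0) + mad ** beta
        counts[k] = counts.get(k, 0) + 1
    return {k: sums[k] / counts[k] for k in sums}


energy = energy_per_type([1.0, 1.0, 4.0], ["intra", "intra", "inter"])
```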

[0038] Picture analysis module 310 also generates time averaged energy estimates,
$\{\bar{e}_k(i);\ k \in K\}$, which are also useful for understanding complexity. Time averaged energy
estimates may be expressed according to the following equations:

$$\bar{e}_k(i) = \frac{e'_k(i)}{\Phi'_k(i)},$$

and

$$e'_k(i) = \alpha_k(i)\, e'_k(i-1) + \Phi_k(i)\, e_k(i),$$
$$\Phi'_k(i) = \alpha_k(i)\, \Phi'_k(i-1) + \Phi_k(i).$$



[0039] Referring to Figure 3, in one embodiment picture analysis module 310 also
generates an intra energy output 8 for use by bit allocation module 330 to improve bit
prediction inside a VBV compliance check. As described below in more detail, intra energy
output 8 is used by bit allocation module 330 to help anticipate sudden changes in picture
complexity that otherwise might lead to VBV underflow and overflow. Picture analysis
module 310 measures the intra energy, $E_{intra}(i)$, for the current picture by summing the
energies of the original pixels for each macroblock in the image. This measurement is useful
because I-frames are typically 12 to 15 frames apart. This measurement is combined with
previous intra energy estimates to generate a current energy estimate for I-pictures, which
may be updated using the following first-order IIR filter equation:

$$\bar{E}_{intra}(i) = \alpha(i) \cdot \bar{E}_{intra}(i-1) + (1 - \alpha(i)) \cdot E_{intra}(i).$$

[0040] Complexity model module 320 receives the macroblock classification and
energy calculation information from picture analysis module 310 and measures the relative
coding "complexities" for each of the macroblock types, given by $\{x_k(i);\ k \in K\}$. In one
embodiment, the complexity model module 320 models the complexity for a macroblock
type k according to the equation:

$$x_k(i) = \frac{1}{\Phi_k(i)} \sum_{j \in J} \delta_k(i,j)\, b(i,j)\, q(i,j),$$

where $b(i,j)$ and $q(i,j)$ are respectively the number of quantization dependent bits and the
quantization scale used to encode macroblock j from picture i.
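
The per-type complexity measurement can be sketched as follows, assuming (consistent with the model R = X/Q in the background) that the complexity of a macroblock is the product of its quantization dependent bits and its quantization scale. The tuple layout, type labels, and function name are illustrative assumptions.

```python
def complexity_per_type(macroblocks):
    """Average of bits * quantizer scale over the macroblocks of each type.

    macroblocks: iterable of (type_label, quantization_dependent_bits, qscale)
    tuples describing one encoded picture.
    """
    sums, counts = {}, {}
    for k, bits, qscale in macroblocks:
        sums[k] = sums.get(k, 0.0) + bits * qscale
        counts[k] = counts.get(k, 0) + 1
    return {k: sums[k] / counts[k] for k in sums}


complexity = complexity_per_type(
    [("intra", 100, 8), ("intra", 60, 8), ("inter", 20, 10)]
)
```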

[0041] Complexity model module 320 also forms running estimates of the 
macroblock type complexities. The macroblock complexities for the current picture are 
combined with past values to generate running estimates for the macroblock type 
complexities, $\{\bar{x}_k(i);\ k \in K\}$. A variety of factors may be used to create running estimates of

the macroblock type complexities such as: including a contribution from all macroblocks of a 
macroblock type corresponding to a particular time instant; basing the contribution of a 
particular picture to the running-average complexity estimate for a particular macroblock 
type to be proportional to the number of macroblocks of that type in the picture; and 
statistically aging estimates such that the contribution of past macroblocks diminishes with 
time. 

[0042] In one embodiment, the following equations are used for computing the
running-average estimates for complexity:

$$\bar{x}_k(i) = \frac{x'_k(i)}{\Phi'_k(i)},$$

and

$$x'_k(i) = \alpha_k(i) \cdot x'_k(i-1) + \Phi_k(i) \cdot x_k(i),$$
$$\Phi'_k(i) = \alpha_k(i) \cdot \Phi'_k(i-1) + \Phi_k(i),$$

where i is the current picture index, $\alpha_k(i)$ is the aging factor associated with macroblock type
k, and $\Phi_k(i)$ is the macroblock count for macroblock type k. The normalizing term in the
denominator, $\Phi'_k(i)$, guarantees that constant input results in constant output.
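
The count-weighted, aged running average can be sketched as a small accumulator that tracks a weighted sum and a weight, and reports their ratio. The class and attribute names are illustrative, and the aging factor is passed in by the caller because the text does not say how it is chosen.

```python
class AgedAverage:
    """Running average weighted by macroblock count, with statistical aging."""

    def __init__(self):
        self.weighted_sum = 0.0  # aged sum of count * value
        self.weight = 0.0        # aged sum of counts (the normalizing term)

    def update(self, value, count, alpha):
        """Fold in one picture's value for this type, observed over `count` macroblocks."""
        self.weighted_sum = alpha * self.weighted_sum + count * value
        self.weight = alpha * self.weight + count
        return self.value

    @property
    def value(self):
        return self.weighted_sum / self.weight if self.weight > 0 else 0.0


avg = AgedAverage()
for count in (10, 3, 7):
    avg.update(5.0, count, alpha=0.9)
```

Because the numerator and denominator are aged by the same factor, a constant input yields a constant output regardless of the counts, and a picture containing no macroblocks of the type leaves the estimate unchanged.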

[0043] Complexity model module 320 calculates an estimate of the complexity of the
picture, which, as described below in more detail, can be used by the bit allocation module to
adjust the target bit rate. The actual complexity, $X(i)$, of the current picture (as determined
after encoding) can be calculated from the encoding complexity, $x_k(i)$, of individual
macroblocks, according to the equation:

$$X(i) = \sum_{k \in K} \Phi_k(i) \cdot x_k(i).$$

[0044] A predicted picture complexity can be calculated by substituting estimates of
the macroblock complexities, $\{\bar{x}_k(i);\ k \in K\}$, for the actual macroblock complexities,
$\{x_k(i);\ k \in K\}$ (which are not known until after the picture is encoded). In addition, the
macroblock complexity estimates are scaled for improved accuracy. The scale factors are
the ratios of the actual macroblock energies and the macroblock energy estimates. Thus, the
predicted picture complexity, $\hat{X}(i)$, may be calculated according to the equation:

where $e_0$ is a small constant (e.g., 0.5) that mitigates the effects of small energy values.
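
One plausible reading of the prediction described in this paragraph is sketched below. The exact form is an assumption based only on the prose: each type contributes its macroblock count times its estimated complexity, scaled by the ratio of measured energy to estimated energy, with the small constant added to both terms of the ratio. The function and argument names are hypothetical.

```python
def predict_picture_complexity(counts, est_complexity, energy, est_energy, e0=0.5):
    """Predict picture complexity before encoding (assumed form, see lead-in).

    counts[k]: number of macroblocks of type k in the current picture.
    est_complexity[k]: running complexity estimate for type k.
    energy[k]: energy measured for type k in the current picture.
    est_energy[k]: running energy estimate for type k.
    e0: small constant that limits the effect of small energy values.
    """
    total = 0.0
    for k, count in counts.items():
        scale = (energy[k] + e0) / (est_energy[k] + e0)
        total += count * est_complexity[k] * scale
    return total


counts = {"intra": 10, "inter": 5}
est_complexity = {"intra": 100.0, "inter": 200.0}
est_energy = {"intra": 1.5, "inter": 2.0}
unchanged = predict_picture_complexity(counts, est_complexity, est_energy, est_energy)
busier = predict_picture_complexity(
    counts, est_complexity, {"intra": 3.5, "inter": 2.0}, est_energy
)
```

When the measured energies equal their running estimates, the prediction reduces to the count-weighted sum of the complexity estimates; a picture with higher intra energy than usual raises the prediction accordingly.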

[0045] The time averaged picture complexity is also estimated for each picture type.
The complexities for each picture type, $\{\bar{X}_m(i);\ m \in M\}$, are synthesized directly from the
macroblock type complexities and their corresponding frequencies as a sum over the
macroblock types $k \in K$.

[0046] Bit allocation module 330 receives complexity model data 9 from complexity 
model module 320, intra-energy estimates 8 from picture analysis module 310, and VBV
fullness data 6 from variable length encoder 250. The complexity model data is used to 
generate an estimate of an ideal target bit rate, which is then adjusted using the intra-energy 
estimates and VBV fullness data to maintain VBV fullness and compliance within acceptable 
limits. 

[0047] Referring to Figure 4, in one embodiment bit allocation module 330 includes
an ideal bit allocation module 410, a VBV fullness adjustment module 420, and a VBV
compliance adjustment module 430. Ideal bit allocation module 410 starts with a nominal or
"ideal" bit budget, $B(i)$, for a forward-looking window, and adjusts it based on the difference
between the actual VBV fullness and a picture-adjusted "ideal" VBV fullness, resulting
in $B'(i)$. Depending on the outcome of an iterative VBV compliance check, this budget may
receive an adjustment to prevent VBV underflow. The final budget, $B''(i)$, determines the bit
allocation and subsequent quantizer assignment for the picture.

[0048] In one embodiment, ideal bit allocation module 410 receives $\{\bar{X}_m(i);\ m \in M\}$,
where $\bar{X}_m(i)$ is the estimated complexity for pictures of type m after encoding picture i and M is the
set of picture types (I, P, or B). From this data, ideal bit allocation module 410 generates
ideal/target bit allocations, where $\{B_m(i);\ m \in M\}$ is the ideal/nominal CBR bit allocation for
pictures of type m prior to encoding picture i for each picture type, according to the equation:

$$B_m(i) = \frac{\bar{X}_m(i-1) / W_m}{\sum_{n \in M} N_n \, \bar{X}_n(i-1) / W_n} \cdot B(i),$$

where $B(i)$ is the bit budget for the next N pictures, m is the picture type index, $W_m$ is a
parameter indicating the relative weighting for pictures of type m, and $N_m$ is the number of
pictures of type m within a window of N pictures (usually but not necessarily a GOP).
Note that the bit allocation is dynamically updated on a per-picture basis using a forward- 
looking rolling window. 

[0049] The target bit allocation algorithm is based on several assumptions. The first
assumption is that the sum of the bit allocations for each of the pictures must equal the total
bit budget for all of the pictures. Second, it is assumed that it is desirable to achieve constant
quality video over all pictures, which implies a single quantizer scale factor for all picture
types in the rolling window. Finally, it is assumed that the following simplified equation
between the bit production for each picture type, $B_n$, the quantizer scale factor, Q, and the
weighted complexity, $X_n/W_n$, applies:

$$B_n = \frac{X_n}{W_n \cdot Q}.$$
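
The three assumptions above determine the allocation: substituting the simplified equation into the budget constraint gives a single quantizer scale for the window, from which the per-type allocations follow. The sketch below works through that algebra; it is a derivation from the stated assumptions, with hypothetical function and argument names, and is not a quotation of the allocation formula itself.

```python
def allocate_bits(budget, complexity, weight, pictures_in_window):
    """Split a window bit budget across picture types at a common quantizer scale.

    budget: total bits for the window of N pictures.
    complexity[m]: estimated complexity for picture type m.
    weight[m]: relative weighting for picture type m.
    pictures_in_window[m]: number of pictures of type m in the window.
    Returns (allocation per picture of each type, implied quantizer scale).
    """
    # The budget constraint sum_m N_m * X_m / (W_m * Q) = budget fixes Q.
    weighted = sum(
        pictures_in_window[m] * complexity[m] / weight[m] for m in complexity
    )
    q = weighted / budget
    allocation = {m: complexity[m] / (weight[m] * q) for m in complexity}
    return allocation, q


complexity = {"I": 400.0, "P": 200.0, "B": 100.0}
weight = {"I": 1.0, "P": 1.0, "B": 1.0}
window = {"I": 1, "P": 4, "B": 10}
allocation, q = allocate_bits(1100.0, complexity, weight, window)
```

With unit weights the allocation is proportional to complexity, and the per-picture allocations multiplied by the picture counts sum back to the window budget.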

[0050] VBV fullness adjustment module 420 receives the target/ideal bit allocation
and also a signal indicative of $VBV(i-1)$, where $VBV(i)$ is the VBV fullness after encoding
picture i. The CBR bit allocation module 330 strives to achieve an "ideal" $VBV(i)$ fullness N
pictures into the future. This "ideal" VBV buffer fullness, $VBV_{ideal}(i)$, represents the
steady-state fullness of the VBV buffer under the assumption that the encoder is allocating and
generating bits in accordance with the specified CBR bit rate and in accordance with the
target bit allocation model using the nominal or "ideal" bit budget of:

$$B(i) = N \cdot B_{peak},$$

where $B_{peak} = R_{peak}/F$. (Note that for CBR, the peak and average rates are equal.)

[0051] The effect of N is such that as N increases, the algorithm reacts more slowly to
deviations from the nominal or "ideal" VBV buffer fullness. A larger N provides a greater
opportunity for constant quality video, but also a greater risk for VBV underflow. If an ideal
VBV fullness is specified immediately prior to encoding the first I-frame, the ideal VBV
fullness, $VBV_{ideal}(i)$, can be determined for all remaining pictures in the rolling window using
the ideal IPB bit allocation. Thus, it is desirable to assign a relatively full buffer just prior to
encoding the I-frame, since I-frames usually generate the most bits per picture. Also, note
that $VBV_{ideal}(i)$ is shift invariant for multiples of N, i.e.,

$$VBV_{ideal}(i + N) = VBV_{ideal}(i).$$

[0052] In order to achieve ideal VBV fullness N pictures into the future, VBV
fullness adjustment module 420 may adjust the nominal total bit budget, $B(i)$, up or down
based on the difference between the actual VBV fullness, $VBV(i)$, and the ideal fullness,
$VBV_{ideal}(i)$, according to the formula:

$$B'(i) = N \cdot B_{peak} - VBV_{ideal}(i-1) + VBV(i-1).$$

[0053] Assuming the bits are produced in accordance with the bit allocation model,
the above equation ensures that ideal VBV fullness will be achieved in N pictures. Based
on $B'(i)$, the rate control creates a bit allocation, $\{B'_m(i);\ m \in M\}$, based on the modified bit
budget.
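
The budget adjustment can be checked with a short numerical sketch. The function name and the sample values are illustrative only.

```python
def adjusted_budget(n_pictures, bits_peak, vbv_ideal_prev, vbv_actual_prev):
    """Window bit budget corrected for the deviation from ideal VBV fullness.

    n_pictures: window length N.
    bits_peak: bits delivered to the VBV buffer per picture period.
    vbv_ideal_prev: ideal VBV fullness after the previous picture.
    vbv_actual_prev: actual VBV fullness after the previous picture.
    """
    return n_pictures * bits_peak - vbv_ideal_prev + vbv_actual_prev


# The buffer is 100,000 bits emptier than ideal, so the budget shrinks by that amount.
budget = adjusted_budget(15, 100_000, 800_000, 700_000)
```

A buffer that is emptier than ideal lowers the budget so that the encoder spends fewer bits and the buffer refills; a buffer that is fuller than ideal raises it.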

[0054] VBV compliance check module 430 employs an iterative VBV compliance
check that will reduce the proposed total bit budget, $B'(i)$, to prevent VBV underflow if VBV
underflow is predicted to occur. The compliance process accomplishes this task by
predicting the future path of VBV fullness, $VBV_{predict}(i)$, for the next N pictures based on an
assumption that the encoder generates bits in a predetermined way. For the most part, the
CBR algorithm assumes bits will be produced in accordance with the proposed bit allocation.
However, there are two exceptions. Specifically, the rate control assumes bits are generated
according to:

$$B_{predict}(i) = \gamma(i) \cdot B'_{m_i}(i_{curr}),$$

for $i = i_{curr}, \ldots, i_{curr} + N - 1$, where $m_i$ is the picture type for picture i, $i_{curr}$ is the index of the
current picture, and $\gamma(i)$ is a scaling factor given by:

$$\gamma(i) = \begin{cases} \hat{X}(i) / \bar{X}_{m_i}(i-1) & \text{if } i = i_{curr} \\ 1.0 & \text{otherwise.} \end{cases}$$



The compliance algorithm initializes $VBV_{predict}(i)$ to:

and updates it according to:

$$VBV_{predict}(i) = VBV_{predict}(i-1) + B_{peak} - B_{predict}(i).$$

[0055] If at some point $VBV_{predict}(i)$ drops below a specified minimum threshold,
$VBV_{min}$, the algorithm reduces the bit allocation to $B''(i)$ using an update that depends on
the predicted bits summed from the current picture to the picture of predicted underflow,
$\sum_{i=i_{curr}}^{i_{err}} B_{predict}(i)$, where $i_{err}$ is the future picture index predicted to cause VBV underflow. Once the total bit
budget is reduced, the CBR algorithm repeats the VBV compliance check using the reduced
bit allocation, $B''(i)$. The equation for $B''(i)$ is derived by imposing the constraint that
$VBV_{predict}(i_{err})$ will equal $VBV_{min}$.
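
The iterative compliance check can be sketched as follows. The scale factor used here is derived from the stated constraint that the predicted fullness at the underflow picture equals the minimum threshold; applying one uniform scale to every predicted picture, starting the path from the actual fullness before the current picture, and the function and argument names are all assumptions of this sketch.

```python
def vbv_compliance_check(vbv_start, bits_peak, predicted_bits, vbv_min, tol=1e-9):
    """Scale predicted bits down until the predicted VBV path stays above vbv_min.

    vbv_start: VBV fullness before the current picture (assumed starting point).
    bits_peak: bits added to the buffer per picture period.
    predicted_bits: predicted bits for each picture in the window, in order.
    Returns (scale, path), where path holds the predicted fullness per picture.
    """
    scale, path = 1.0, []
    # Each pass moves the first underflow later, so N + 1 passes suffice.
    for _ in range(len(predicted_bits) + 1):
        path, spent, underflow_at = [], 0.0, None
        for n, bits in enumerate(predicted_bits, start=1):
            spent += scale * bits
            fullness = vbv_start + n * bits_peak - spent
            path.append(fullness)
            if fullness < vbv_min - tol:
                underflow_at = n
                break
        if underflow_at is None:
            return scale, path
        # Choose the scale so that fullness at the underflow picture equals vbv_min.
        available = vbv_start + underflow_at * bits_peak - vbv_min
        scale = max(0.0, available / sum(predicted_bits[:underflow_at]))
    return scale, path  # reached only if no non-negative scale avoids underflow


scale, path = vbv_compliance_check(1000.0, 100.0, [600.0, 600.0, 100.0], 200.0)
```

In the example the second picture would drive the buffer to zero, so the allocation is scaled by 1000/1200 and the predicted path then touches the threshold at that picture without crossing it.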

[0056] The output 10 of VBV compliance adjustment is a bit budget, $B''(i)$, and
corresponding bit allocation, $\{B''_m(i);\ m \in M\}$, that is predicted to avoid VBV underflow over
the next N pictures. Using this bit allocation, quantization module 240 may generate a target
quantizer step size for the current picture using formula:

where $m_i$ is the picture type of picture i.

[0057] Prior to encoding, the target quantizer scale for each macroblock, which is 
nominally a real value, must be converted to an integer for compatibility with MPEG. A 
dithering algorithm may perform a translation at a specified update rate. 

[0058] Referring to Figure 5, in one embodiment programmable rate controller 260
includes a CBR rate controller 290 and a core VBR rate controller 280. The CBR rate
controller 290 and the core VBR rate controller 280 make independent calculations of bit rate
and a selection module 510 selects the maximum of the two bit rates. The final VBR
quantizer scale, $Q$, is selected as the larger of the two proposed values:

$$Q = \begin{cases} Q_{VBR} & \text{if } Q_{VBR} > Q_{CBR} \\ Q_{CBR} & \text{otherwise.} \end{cases}$$
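
The selection rule amounts to taking the larger of the two proposed quantizer scales. In the sketch below the function name and the returned mode label are illustrative additions for readability.

```python
def select_quantizer(q_vbr, q_cbr):
    """Return (mode, quantizer scale) for the larger of the two proposals.

    A larger quantizer scale yields fewer bits, so the more conservative
    proposal governs; ties go to the CBR value, matching the equation above.
    """
    return ("VBR", q_vbr) if q_vbr > q_cbr else ("CBR", q_cbr)


easy_scene = select_quantizer(12.0, 9.5)   # VBR target quantizer dominates
hard_scene = select_quantizer(6.0, 9.5)    # CBR constraint dominates
```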

[0059] In one embodiment, the core VBR rate controller 280 creates a target bit
allocation, $B_{VBR}(i)$, for each picture by tracking the long-term average bit rate. A variety of

tracking techniques may be used. These may include, for example, filters to filter out short 
term deviations in bit rate while permitting the long-term average bit-rate to vary slowly with 
respect to subsequent pictures within a group of pictures. One suitable tracking technique is 
to use proportional integral control techniques to select a response that is selectable by 
inputting a time constant that determines the nature of the response. 

[0060] Figure 6 is a block diagram illustrating a model of the core VBR rate 
controller 280 having a second-order Proportional-Integral (PI) controller to track the 
long-term average bit rate. A difference in bit rate, $B_{delta}$, between the average bit rate, 
$B_{avg}$, and the actual bit rate, $B_{actual}$, is used as an input to adjust the long-term average 
bit rate. The target VBR bit rate is given by: 

$$B_{VBR}(i) = B_{VBR}(i-1) + K_p \cdot B_{delta}(i) + K_i \cdot A(i),$$

where $B_{delta}(i)$ represents the instantaneous bit-rate deviation and is given by: 

$$B_{delta}(i) = B_{avg} - B_{actual}(i-1),$$

where $A(i)$ represents the cumulative bit-rate deviation and is given by: 

$$A(i) = A(i-1) + B_{delta}(i),$$

and where $B_{actual}(i)$ is the actual bits generated by the encoder for picture i and $B_{avg}$ 
is $R_{avg}/F$. 
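
The update equations of the PI tracker can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation of the core VBR rate controller 280: the class name, the method name, and the initial conditions (the target starting at the average bits per picture, with a zero accumulator) are assumptions.

```python
class CoreVbrTracker:
    """PI tracker for the per-picture target bit allocation B_VBR(i)."""

    def __init__(self, r_avg, picture_rate, k_p, k_i):
        self.b_avg = r_avg / picture_rate  # B_avg = R_avg / F, bits per picture
        self.k_p = k_p                     # proportional coefficient K_p
        self.k_i = k_i                     # integral coefficient K_i
        self.b_vbr = self.b_avg            # assumed initial value of B_VBR
        self.acc = 0.0                     # assumed initial value of A

    def next_target(self, b_actual_prev):
        """Return B_VBR(i) given the bits actually produced for picture i-1."""
        b_delta = self.b_avg - b_actual_prev  # instantaneous deviation
        self.acc += b_delta                   # cumulative deviation A(i)
        self.b_vbr += self.k_p * b_delta + self.k_i * self.acc
        return self.b_vbr
```

For example, with an average of 100,000 bits per picture, a previous picture that consumed 120,000 bits lowers the next target, and the accumulator keeps lowering it until the long-term average is restored.
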

[0061] These update equations result in an open-loop transfer function given by: 

$$G(z) = \frac{(K_p + K_i) - K_p z^{-1}}{(1 - z^{-1})^2}.$$

[0062] The target bit allocation is used to derive the proposed quantizer scale for the 
core VBR algorithm, $Q'_{VBR}(i)$, according to the formula: 

$$Q'_{VBR}(i) = \frac{X_0}{B_{VBR}(i)},$$

where $X_0$ represents a nominal measure of complexity given by $X_0 = Q_0 \cdot B_{avg}$, with $Q_0$ 
corresponding to the initial desired quantizer scale. This equation models the inverse 
relationship between the quantizer selection and the output bits produced by quantization and 
variable length encoding. 
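
As a worked illustration of the inverse rate-quantization model and of the selection between the VBR and CBR proposals described earlier, the following Python sketch computes a proposed quantizer scale from a target bit allocation and then keeps the larger of the two proposals. The function and argument names are hypothetical, and the rejection of non-positive bit targets is an added safeguard, not part of the description.

```python
def propose_vbr_quantizer(b_vbr, q0, b_avg):
    """Proposed VBR quantizer scale X0 / B_VBR, with X0 = Q0 * B_avg."""
    if b_vbr <= 0:
        raise ValueError("target bit allocation must be positive")
    x0 = q0 * b_avg  # nominal complexity
    return x0 / b_vbr


def select_final_quantizer(q_vbr_proposed, q_cbr):
    """Final VBR quantizer scale: the larger of the two proposals."""
    return max(q_vbr_proposed, q_cbr)
```

For instance, with $Q_0$ = 8 and $B_{avg}$ = 100,000 bits, halving the target allocation to 50,000 bits doubles the proposed quantizer scale to 16.
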

[0063] As illustrated in Figure 7, for the case that the rate-quantization model 
($X_0/B_{VBR}$) is accurate, the $X_0/B_{VBR}$ term and the quantization and variable length 
encoding blocks cancel each other out, i.e., $B_{VBR}(i) \approx B_{actual}(i)$, and the VBR rate 
controller model reduces to the traditional linear feedback control system with a feedback 
transfer function, $H(z)$, given by: 

$$H(z) = \frac{B_{VBR}(z)}{B_{avg}(z)} = \frac{G(z)}{1 + G(z)} = \frac{(K_i + K_p) - K_p z^{-1}}{(1 + K_i + K_p) - (2 + K_p) z^{-1} + z^{-2}}.$$

For a first-order system, $K_i = 0$ and $K_p$ can be determined directly from the desired time 
constant $\tau$ according to: 

$$K_p = e^{T/\tau} - 1.$$

For second-order control, we simply replicate the pole ($p = 2 - e^{T/\tau}$) from the first-order 
system, resulting in PI coefficients given by: 

$$K_p = 2\left(\frac{1 - p}{p}\right)$$

and 

$$K_i = \frac{K_p^2}{4}.$$
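
The coefficient selection can be checked numerically. The sketch below assumes the double-pole coefficients $K_p = 2(1 - p)/p$ and $K_i = K_p^2/4$, which follow from requiring that the closed-loop denominator $(1 + K_i + K_p) - (2 + K_p) z^{-1} + z^{-2}$ have a repeated root at $p$. The function names are hypothetical, and the pole is passed in as an argument rather than computed from the time constant.

```python
import cmath
import math


def first_order_kp(picture_period, tau):
    """First-order case (K_i = 0): K_p = exp(T / tau) - 1."""
    return math.exp(picture_period / tau) - 1.0


def second_order_gains(p):
    """Return (K_p, K_i) that place both closed-loop poles at p."""
    if not 0.0 < p < 1.0:
        raise ValueError("pole must lie strictly between 0 and 1")
    k_p = 2.0 * (1.0 - p) / p
    k_i = k_p * k_p / 4.0
    return k_p, k_i


def closed_loop_poles(k_p, k_i):
    """Roots of (1 + K_i + K_p) z^2 - (2 + K_p) z + 1 = 0."""
    a = 1.0 + k_i + k_p
    b = -(2.0 + k_p)
    disc = cmath.sqrt(b * b - 4.0 * a)
    return (-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)
```

Calling `closed_loop_poles(*second_order_gains(0.5))` returns two roots at 0.5, confirming that the discriminant vanishes and the response is critically damped for this choice of coefficients.
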

[0064] Using a bilinear transformation, the feedback transfer function, $H(z)$, can be 
mapped into the Laplace domain and the resulting transfer function equated with the closed 
loop transfer function of a traditional analog second-order PI control system (with the 
additional assumption that the sample rate is quite large compared to the frequency range of 
interest). This mapping permits computation of estimates for traditional linear control system 
parameters such as the undamped natural frequency, $\omega_n$, the damping factor, $\zeta$, and the 
time constant, $\tau$, given by: 

$$\omega_n = \frac{\sqrt{K_i}}{T},$$

$$\zeta = \frac{K_p}{2\sqrt{K_i}},$$

and 

$$\tau = \frac{1}{\zeta\,\omega_n}.$$

[0065] One benefit of the rate controller of the present invention is that it may be used 
for real time applications requiring a small time delay. In particular, the rate encoder may be 
implemented as a computationally efficient single pass rate encoder, i.e., a rate encoder not 
requiring multiple iterations of data to estimate the complexity of a picture with a sufficiently 
high accuracy to avoid VBV overflows and underflows. 

[0066] Another benefit of the present invention is that the parametric constraints may 
be set for the needs of particular applications to achieve flexible tradeoffs between rate and 
quality. $R_{peak}$ and $B_{VBV}$ are set to ensure a VBV compliant output bitstream. For many 
applications, these values determine what classes of decoders are guaranteed to play the 
bitstream. For example, the maximum compatible values for DVD correspond to $R_{peak}$ = 9.8 
Mbps and $B_{VBV}$ = 1,835,008 bits. Likewise, the constrained parameter limitations for MPEG-1 
are $R_{peak}$ = 1.856 Mbps and $B_{VBV}$ = 327,680 bits. 

[0067] As one example, the rate controller can be set to a constant quality mode to 
provide the highest quality. For this case, $Q_{target}$ is set to the desired quality level. 
$R_{peak}$, $B_{VBV}$, and a VBV Compliance Flag are set appropriately if VBV compliance is desired. 
However, a drawback of constant quality is that it results in an unpredictable file size. 

[0068] In another mode of operation, the best possible quality is selected for a 
predetermined file size. In one embodiment, the settings for this mode of operation include: 
setting $R_{avg}$ to the long-term average bit rate goal; setting the time constant, $\tau$, to a 
large value to minimize the effect of short-term bit rate production on quality, such that 
$\tau$ is preferably longer than the longest expected scene of any given complexity; setting 
$Q_0$ to an appropriate initial value for the VBR algorithm; setting $Q_{min}$ to an appropriate 
value so the encoder will not produce excessive bits for simple content in order to maintain 
the average bit rate goal; setting $Q_{max}$ to an appropriate value so the encoder will not 
overly quantize complex scenes in order to maintain the average bit rate goal; and setting 
$R_{peak}$, $B_{VBV}$, and a VBV Compliance Flag to enable VBV compliance. An advantage of this 
mode of operation is that it provides the best possible quality for a predetermined file size. 
However, it has the drawback that medium and high complexity scenes will end up with the 
same number of bits if the scene length is longer than the specified VBR time constant. 

[0069] Another mode of operation is a true CBR video mode. The settings for this 
mode correspond to setting $R_{peak}$ to a desired bit rate; setting the Constant Rate Flag to 
true; and setting $B_{VBV}$ and the VBV Compliance Flag appropriately for VBV compliance. An 
advantage of this mode is that it provides true CBR video that can be written to VCD. 
However, a drawback is that the video quality is lower compared to other modes. As an 
example of CBR video mode for a VCD, the settings may be $R_{avg}$ = $R_{peak}$ = 1.15 Mbps 
and $B_{VBV}$ = 327,680 bits, with the Constant Rate Flag set to true. 
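
The mode settings can be pictured as a small parameter record. The Python sketch below is a hypothetical configuration structure, not an interface defined by the description: the class, its field names, and the consistency checks are assumptions made for illustration. The preset uses 1.15 Mbit/s, the VCD video rate, together with the 327,680-bit VBV buffer size given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateControlParams:
    r_avg: float          # R_avg, target average bit rate (bits/s)
    r_peak: float         # R_peak, maximum bit rate (bits/s)
    b_vbv: int            # B_VBV, VBV buffer size (bits)
    constant_rate: bool   # Constant Rate Flag
    vbv_compliance: bool  # VBV Compliance Flag

    def check(self):
        """Return a list of detected inconsistencies (assumed rules)."""
        problems = []
        if self.r_avg > self.r_peak:
            problems.append("R_avg exceeds R_peak")
        if self.constant_rate and self.r_avg != self.r_peak:
            problems.append("CBR mode expects R_avg equal to R_peak")
        if self.b_vbv <= 0:
            problems.append("B_VBV must be positive")
        return problems


# Example preset for true CBR video intended for VCD.
VCD_CBR = RateControlParams(
    r_avg=1_150_000,
    r_peak=1_150_000,
    b_vbv=327_680,
    constant_rate=True,
    vbv_compliance=True,
)
```

Calling `VCD_CBR.check()` returns an empty list, whereas a record that requests CBR operation with an average rate above the peak rate is reported as inconsistent.
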

[0070] Consider the example of burning a compact disk with $R_{peak}$ = 9.8 Mbps and 
$B_{VBV}$ = 1,835,008 bits. $R_{avg}$ need not equal $R_{peak}$ and is determined by the storage 
capacity of the medium and the duration of the source content, and a compromise in rate must 
be made to fit on the disk at the expense of perfect video quality. A constant rate flag is set 
to false for this case. For this case, freedom exists to specify $Q_{min}$, but not $Q_{max}$ (since 
it may prevent the rate control algorithm from achieving $R_{avg}$). 
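
The dependence of the average rate on storage capacity and content duration is simple arithmetic, sketched below in Python. The function name and the `overhead_fraction` argument (space reserved for audio, multiplexing, or file-system overhead) are hypothetical and are not taken from the description.

```python
def average_rate_for_medium(capacity_bytes, duration_s, r_peak,
                            overhead_fraction=0.0):
    """Average video bit rate (bits/s) that fills the medium, capped at r_peak."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    usable_bits = capacity_bytes * 8 * (1.0 - overhead_fraction)
    return min(usable_bits / duration_s, r_peak)
```

As an example under these assumptions, 4.7 billion bytes of capacity and two hours of content give an average rate of roughly 5.2 Mbit/s, which is below a 9.8 Mbit/s peak. One hour of content on the same medium would be limited by the peak rate instead.
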

[0071] For a personal video recorder (PVR) the constraints on $R_{avg}$ are not tight, 
assuming a large hard drive memory storage capacity for storing compressed MPEG files. 
More freedom exists to choose $R_{avg}$, $R_{peak}$ and $B_{VBV}$. Freedom exists to specify $Q_{min}$ 
and $Q_{max}$ since the constraint on $R_{avg}$ is soft. For a large hard drive, quality 
effectively trumps rate, i.e., it is probably better to exceed $R_{avg}$ instead of degrading the 
video quality. 

[0072] Figure 8 is an exemplary plot of quantizer step size for dual mode VBR/CBR 
operation. In this example, the encoder operates in CBR mode and shifts mode to VBR for 
more complex scenes. 

[0073] Thus, from these examples it will be understood that a video compression 
encoder of the present invention is particularly beneficial where a single encoder is used for 
applications having different constraints, such as burning a CD, PVR, etc. 

[0074] It will be understood that programmable rate controller 260 and encoder 200 
may be implemented in hardware, software, firmware, or combination thereof. 
Consequently, a software embodiment of the present invention relates to a computer storage 
product with a computer-readable medium having computer code thereon for performing 
various computer-implemented operations. The media and computer code may be those 
specially designed and constructed for the purposes of the present invention, or they may be 
of the kind well known and available to those having skill in the computer software arts. 
Examples of computer-readable media include, but are not limited to: magnetic media such as 
hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and 
holographic devices; magneto-optical media such as floptical disks; and hardware devices 
that are specially configured to store and execute program code, such as application-specific 
integrated circuits ("ASICs"), programmable logic devices ("PLDs") and ROM and RAM 
devices. Examples of computer code include machine code, such as produced by a compiler, 
and files containing higher-level code that are executed by a computer using an interpreter. 
For example, an embodiment of the invention may be implemented using Java, C++, or other 
object-oriented programming language and development tools. Another embodiment of the 
invention may be implemented in hardwired circuitry in place of, or in combination with, 
machine-executable software instructions. 

[0075] The foregoing description, for purposes of explanation, used specific 
nomenclature to provide a thorough understanding of the invention. However, it will be 
apparent to one skilled in the art that specific details are not required in order to practice the 
invention. Thus, the foregoing descriptions of specific embodiments of the invention are 
presented for purposes of illustration and description. They are not intended to be exhaustive 
or to limit the invention to the precise forms disclosed; obviously, many modifications and 
variations are possible in view of the above teachings. The embodiments were chosen and 
described in order to best explain the principles of the invention and its practical applications, 
and thereby to enable others skilled in the art to best utilize the invention and various 
embodiments with various modifications as are suited to the particular use contemplated. It 
is intended that the following claims and their equivalents define the scope of the invention. 
Appendix 1: Table of exemplary programmable controller parameters. 

Parameter | Description 
$R_{avg}$ | The target average bit rate of the output bitstream in bits/sec. However, the value is related to Q. For example, if $Q_{min}$ is specified, $R_{avg}$ represents an upper bound on the average rate, i.e., a peak average rate over the window specified. If $Q_{max}$ is specified, the encoder may not be able to reduce the bit rate sufficiently to achieve $R_{avg}$ for some content. 
$R_{peak}$ | The maximum bit rate specified in the sequence header of the bitstream used by the video bitstream verification (VBV) model. 
$\tau$ | The time duration in msec over which the VBR rate controller reacts to deviations in average bit rate. 
$B_{VBV}$ | The size of the VBV buffer in bits and the implicit value for the peak rate window size. 
$Q_{target}$ | The target quantizer scale for all macroblocks used by the VBR rate controller. The rate controller may override $Q_{target}$ to prevent VBV underflow. 
$Q_0$ | The initial quantizer scale value for the VBR rate control algorithm. 
$Q_{min}$ | A lower bound on the target VBR quantizer scale value for a picture. However, setting this value may prevent the encoder from achieving the average rate specified by $R_{avg}$. 
$Q_{max}$ | An upper bound on the target VBR quantizer scale value for a picture. Note: setting this value may prevent the encoder from achieving the average rate specified by $R_{avg}$. If VBV Compliance Flag is set, the rate control may override $Q_{max}$ to prevent VBV underflow. 
$W_P$, $W_B$ | The relative weighting for the bit allocation of P and B pictures as compared to I pictures (where $W_I$ is implicitly 1.0). Typical values are 1.0 for $W_P$ and 1.4 for $W_B$. 

Appendix 2: Summary table of equation symbol definitions. 

Symbol | Definition 
$K$ | Set of macroblock types 
$M$ | Set of picture types (I, P, or B) 
$J$ | Set of macroblock indices in a picture 
$i$ | Usually the picture index 
$j$ | Usually the macroblock index 
$k$ | Usually the macroblock type index 
$m$, $n$ | Usually the picture type index 
$B_{peak}$ | Maximum average bits per picture 
$F$ | Picture rate ($1/T$) 
$T$ | Picture period ($1/F$) 
$P_{x,y}(i,j)$ | Luminance value of the pixel corresponding to row x and column y of macroblock j in the original input picture i 
[illegible] | Luminance value of the pixel corresponding to row x and column y of macroblock j in the difference image resulting from the motion compensation of picture i 
$b(i,j)$ | The number of quantization-dependent bits generated from encoding macroblock j in picture i 
$x_k(i)$ | Average complexity of macroblocks of type k in picture i 
[illegible] | Estimated complexity for macroblocks of type k after encoding picture i 
$X(i)$ | Complexity of picture i 
$X(i)$ | Predicted complexity for picture i (using energy scale factor) 
$X_m(i)$ | Estimated complexity for pictures of type m after encoding picture i 
$W_m$ | Relative bit allocation weighting factor for pictures of type m 
$e_k(i)$ | Average energy of macroblocks of type k in picture i 
[illegible] | Estimated energy for macroblocks of type k after encoding picture i 
$E_{intra}(i)$ | Intra energy of picture i 
$E_{intra}(i)$ | Estimated intra energy for pictures of type m after encoding picture i 
[illegible] | The number of macroblocks of type k in picture i 
[illegible] | Normalizing factor used to calculate $e_k(i)$ and $x_k(i)$ 
$\Gamma_k(i)$ | The fraction of macroblocks of type k in picture i 
$\Gamma_{k,m}(i)$ | Estimated fraction of macroblocks of type k occurring in pictures of type m after encoding picture i 
$B(i)$ | Ideal/nominal CBR bit allocation for rolling window of N pictures based on the relative complexity of I, P, and B pictures prior to encoding picture i 
$B'(i)$ | Initial target CBR bit allocation for rolling window of N pictures prior to VBV compliance adjustment prior to encoding picture i 
$B''(i)$ | Final target CBR bit allocation for rolling window of N pictures after VBV compliance adjustment prior to encoding picture i 
$B_m(i)$ | Ideal/nominal CBR bit allocation for pictures of type m prior to encoding picture i 
$B'_m(i)$ | Initial target CBR bit allocation prior to VBV compliance adjustment for pictures of type m prior to encoding picture i 
$B''_m(i)$ | Target CBR bit allocation after VBV compliance adjustment for pictures of type m prior to encoding picture i 
$VBV(i)$ | VBV fullness after encoding picture i 
[illegible] | Ideal VBV fullness after encoding picture i if rate-quant model is accurate and in steady-state 
[illegible] | Predicted VBV fullness after encoding picture i based on the target CBR bit allocation 
$Q_{CBR}(i)$ | The CBR picture-level quantizer scale value for encoding picture i 
$Q'_{VBR}(i)$ | The preliminary VBR picture-level quantizer scale value for encoding picture i that does not guarantee VBV compliance 
$Q_{VBR}(i)$ | The VBR picture-level quantizer scale value for encoding picture i 
[illegible] | The quantizer scale value for encoding macroblock j in picture i 
$\tau$ | Time constant for the VBR algorithm 
$K_p$, $K_i$ | Filter coefficients for the VBR PI feedback loop 
$\alpha(i)$ | Picture-level aging parameter used to calculate $E_{intra}(i)$ and $\Gamma_k(i)$ 
$\alpha_k(i)$ | Macroblock-level aging parameter used to calculate $e_k(i)$, $x_k(i)$, and [illegible] 

Appendix 3: Summary table of exemplary signals at different points in the rate controller. 

Signal | Input/Output Parameters 
1 | • $\{P_{x,y}(i,j);\ j \in J\}$: the luminance values of each pixel corresponding to row x and column y of macroblock j in the original input picture i, where J is the set of macroblock indices in a picture 
2 | • [symbol illegible]: the luminance value of the pixel corresponding to row x and column y of macroblock j in the difference image resulting from the motion compensation of picture i 
3 | • Macroblock coding decisions for picture i 
4 | • $Q_{CBR}(i)$: the CBR picture-level quantizer scale value for encoding picture i 
5 | • $\{b(i,j);\ j \in J\}$: the number of quantization-dependent bits generated from encoding macroblock j in picture i 
  | • [symbol illegible]: the corresponding quantizer scale value for encoding macroblock j in picture i 
6 | • $VBV(i-1)$, where $VBV(i)$ is the VBV fullness after encoding picture i 
7 | • [symbol illegible]: the number of macroblocks of type k in picture i 
  | • [symbol illegible]: the normalizing factor used to calculate $e_k(i)$ and $x_k(i)$ 
  | • $\{\Gamma_k(i);\ k \in K\}$: the fraction of macroblocks of type k in picture i 
  | • $\{\Gamma_{m,k}(i);\ k \in K,\ m \in M\}$: the estimated fraction of macroblocks of type k occurring in pictures of type m after encoding picture i 
  | • $\{e_k(i);\ k \in K\}$: the average energy of macroblocks of type k in picture i 
  | • $\{e_k(i);\ k \in K\}$: the estimated energy for macroblocks of type k after encoding picture i 
8 | • $E_{intra}(i)$: the intra energy of picture i 
  | • $E_{intra}(i)$: the estimated intra energy for pictures of type m after encoding picture i 
9 | • $\{X_m(i);\ m \in M\}$, where $X_m(i)$ is the estimated complexity for pictures of type m after encoding picture i and M is the set of picture types (I, P, or B) 
  | • [symbol illegible]: the predicted complexity for picture i (using energy scale factor) 
10 | • $B''_m(i)$: the target CBR bit allocation after VBV compliance adjustment for pictures of type m prior to encoding picture i 
   | • $X_m(i)$: the estimated complexity for pictures of type m after encoding picture i 